Optimze Gelu with MKL Erf function by yihuaxu · Pull Request #15770 · PaddlePaddle/Paddle

yihuaxu · 2019-02-18T07:52:33Z

Based on the performance profile of the BERT model, this PR optimizes the GELU operator to speed up data processing.

Platform: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
Model Path: third_party/inference_demo/bert_emb128/model
Batch Size: 1
Command: ./paddle/fluid/inference/tests/api/test_analyzer_bert --infer_model=third_party/inference_demo/bert_emb128/model/ --infer_data=third_party/inference_demo/bert_emb128/data.txt --gtest_filter=Analyzer_bert.profile --paddle_num_threads=1 --repeat=1 --batch_size=1 --test_all_data --profile
Data Source: third_party/inference_demo/bert_emb128/data.txt.

The following is a comparison of the different scenarios.

yihuaxu · 2019-02-19T03:21:18Z

start a review

luotao1 · 2019-02-19T03:24:57Z

cmake/external/mklml.cmake

    SET(MKLML_SHARED_IOMP_LIB     ${MKLML_LIB_DIR}/libiomp5md.dll)
 ELSE()  
-    SET(MKLML_VER "mklml_lnx_${TIME_VERSION}" CACHE STRING "" FORCE)
+    SET(MKLML_VER "VsErf_mklml_lnx_${TIME_VERSION}" CACHE STRING "" FORCE)


Please add a comment noting that this is a temporary mklml library that includes erf, e.g. TODO(intel-huying).

tensor-tang · 2019-02-19T13:41:33Z

paddle/fluid/operators/math/blas.h

  template <typename T>
  void VINV(int n, const T* a, T* y) const;

+#ifdef PADDLE_WITH_MKLML


Do not add #ifdef here.

You can make this function general.

See VMUL for an example.
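One way to read this suggestion, sketched under assumptions: declare the function for every build configuration and keep any library-specific branch inside its definition, so call sites need no guard. The VecMath struct below is a hypothetical stand-in, not Paddle's actual Blas class, and the portable loop uses std::erf in place of the MKL routine.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a BLAS-style helper; not Paddle's actual class.
template <typename T>
struct VecMath {
  // Declared for every build configuration, so call sites need no #ifdef.
  static void VMERF(int n, const T* a, T* y);
};

template <typename T>
void VecMath<T>::VMERF(int n, const T* a, T* y) {
  assert(n >= 0);
  // A build that links a vendor math library could dispatch to its
  // vectorized erf at this point; this sketch shows only the portable path.
  for (int i = 0; i < n; ++i) {
    y[i] = std::erf(a[i]);
  }
}
```

A caller would write VecMath<float>::VMERF(n, in, out) regardless of which libraries the build links.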

tensor-tang · 2019-02-19T13:44:13Z

paddle/fluid/operators/activation_op.h

+    std::memset(out_data, 0, n * sizeof(T));
+    math::CBlas<T>::AXPY(n, static_cast<T>(M_SQRT1_2), x_data, 1, out_data, 1);
+    math::CBlas<T>::VMERF(n, out_data, out_data, VML_LA);
+    for (int i = 0; i < n; i++) out_data[i] += static_cast<T>(1);


Code style: please wrap the loop body in braces:

for () { ... }

The same applies to the loops below.
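For context, the quoted lines compute erf(x / sqrt(2)) + 1, which is the inner part of GELU(x) = 0.5 * x * (1 + erf(x / sqrt(2))). The sketch below is a portable reference written in the braced loop style requested above. It assumes the standard GELU definition for the final multiply, which the quoted excerpt does not show. GeluReference is a hypothetical name, and std::erf stands in for the MKL call.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical portable reference; not the kernel from this PR.
template <typename T>
void GeluReference(int n, const T* x, T* out) {
  assert(n >= 0);
  const T inv_sqrt2 = static_cast<T>(0.7071067811865476);  // 1 / sqrt(2)
  // Step 1: out = erf(x / sqrt(2)); the PR uses AXPY plus the MKL erf here.
  for (int i = 0; i < n; ++i) {
    out[i] = std::erf(x[i] * inv_sqrt2);
  }
  // Step 2: out = 1 + erf(x / sqrt(2)), matching the last quoted line.
  for (int i = 0; i < n; ++i) {
    out[i] += static_cast<T>(1);
  }
  // Step 3 (assumed from the GELU definition): out = 0.5 * x * out.
  for (int i = 0; i < n; ++i) {
    out[i] *= static_cast<T>(0.5) * x[i];
  }
}
```

A reference like this can be useful for checking a low-accuracy vectorized kernel against a known-good result within a chosen tolerance.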

tensor-tang

LGTM

yihuaxu · 2019-02-21T00:25:51Z

@panyx0718 Please help review this PR, because it changes key files.

[10:23:00] + echo 'current pr 15770 got approvals: FALSE'
[10:23:00] + '[' FALSE == FALSE ']'
[10:23:00] + echo 'You must have panyx0718 approval for the api change! cmake/external'
[10:23:00] + exit 1
[10:23:07] Process exited with code 1

tensor-tang · 2019-02-21T15:33:16Z

The new profile #15301 seems to have a dependency problem: http://ci.paddlepaddle.org/viewLog.html?tab=buildLog&logTab=tree&filter=debug&expand=all&buildId=61983&_focus=9247

In file included from /paddle/paddle/fluid/platform/profiler.h:20:0,
[15:25:05]                 from /paddle/paddle/fluid/platform/device_tracer.cc:33:
[15:25:05]/paddle/paddle/fluid/platform/device_context.h:30:22: fatal error: mkldnn.hpp: No such file or directory
[15:25:05]compilation terminated.
[15:25:05]make[2]: *** [paddle/fluid/platform/CMakeFiles/device_tracer.dir/device_tracer.cc.o] Error 1
[15:25:05]make[1]: *** [paddle/fluid/platform/CMakeFiles/device_tracer.dir/all] Error 2
[15:25:05]make[1]: *** Waiting for unfinished jobs....

#15860 seems to have fixed it. @yihuaxu, you need to merge in the latest changes.

yihuaxu added 5 commits January 28, 2019 16:24

Optimize for gelu operator

ab7fc9a

Set up the low accuracy mode of MKL ERF function.

7cbbae8

test=develop

Only enable MKLML ERF when OS is linux

1a895f6

Use the speical mklml version included vmsErf function to verify gelu…

f9850da

… mkl kernel. test=develop

Merge branch 'develop' into develop_a6910f900_gelu_mkl_opt

5f55ede

test=develop

yihuaxu force-pushed the develop_a6910f900_gelu_mkl_opt branch from 76eaa67 to 5f55ede on February 18, 2019 07:56

luotao1 added the Intel label Feb 18, 2019

luotao1 requested a review from tensor-tang February 18, 2019 08:35

Add the CUDA macro to avoid NVCC's compile issue.

a4c8ef2

test=develop

luotao1 reviewed Feb 19, 2019

Add the TODO comments for mklml library modification.

e1e818e

test=develop

tensor-tang reviewed Feb 19, 2019

yihuaxu added 2 commits February 20, 2019 09:51

Clean Code

aed7d9a

test=develop

Add the comment of marco for NVCC compiler.

f887bab

test=develop

tensor-tang reviewed Feb 20, 2019

luotao1 requested a review from panyx0718 February 21, 2019 02:00

panyx0718 approved these changes Feb 21, 2019

Merge branch 'develop' into develop_a6910f900_gelu_mkl_opt

8944656

test=develop

tensor-tang merged commit 676995c into PaddlePaddle:develop Feb 22, 2019

tensor-tang added a commit that referenced this pull request Feb 22, 2019

Revert "Optimze Gelu with MKL Erf function (#15770)"

70c198a

This reverts commit 676995c.

tensor-tang mentioned this pull request Feb 22, 2019

Revert "Optimze Gelu with MKL Erf function" #15871

Closed

tensor-tang added a commit that referenced this pull request Feb 22, 2019

Revert 15770 develop a6910f9 gelu mkl opt (#15872)

ee2321d

* Revert "Optimze Gelu with MKL Erf function (#15770)" This reverts commit 676995c. * test=develop

yihuaxu mentioned this pull request Feb 26, 2019

Optimize gelu operation with mkl erf #15931

Merged

luotao1 mentioned this pull request Mar 21, 2019

[MKL-DNN] Fully Connected #15226

Merged

tensor-tang merged 10 commits into PaddlePaddle:develop from yihuaxu:develop_a6910f900_gelu_mkl_opt

4 participants
